REMARKS



The comments of the applicant below are each preceded by related comments of the 
examiner (in small, bold type). 

Claims 1, 4, 17, 18, 27, 28, and 30 recite the limitations "the historical data" and 
"the data". There is insufficient antecedent basis for these limitations in the amended claims. 
The amended claims refer to "the historical data", but recite only "the data" earlier in the 
claims. Also, the amended claims refer to "the data", but recite only "the historical data" 
earlier in the claims. That is, "the historical data" and "the data" may (if used 
interchangeably) or may not refer to the same dataset. If "the historical data" and "the 
data" refer to the same dataset (if used interchangeably), Applicants are directed to not use 
different terms to mean the same object. 

The claims have been amended. 

Claim 30 claims "the propensity computed by the model". A description of a model 
computing propensity is non-existent in the application description, and this may raise 
indefiniteness issues. 

The applicant disagrees. The description of a model computing propensity is 
provided, for example, in claim 30 as filed, or specification, for example, on page 22, 
lines 12-14. 

Claims 1-11, 13-19, 22, 23, and 25-30 are rejected under 35 U.S.C. 112, second 
paragraph, as being indefinite for failing to particularly point out and distinctly claim the 
subject matter which applicant regards as the invention. 

Claim 1, as amended, fails to perform the method set forth in the limitation "in 
connection with a project in which a user generates a predictive model based on historical 
data about a system being modeled". In the body of the claim which follows this limitation, 
no predictive model is generated. This limitation is dangling. In the amended claim, 
Examiner is unclear if this limitation is part of the preamble or the body of the claim 
because of the amended colon which follows "method comprising". See colons in claim 1, 
lines 1 and 3. Examiner interprets this limitation as part of the body of the claim for 
examination purposes. 

Claim 1 recites a machine-based method comprising limitations which generate a 
possible model and a final model. However, the limitation "in connection with a project in 
which a user generates a predictive model based on historical data about a system being 
modeled" has a user generating a predictive model. Which one is it? Is a machine to 
generate a model(s) or a user to generate his model? Examiner is unsure about the scope of 
the claims. 

Claims 34-40 are rejected under 35 U.S.C. 112, second paragraph, as being 
incomplete for omitting essential steps, such omission amounting to a gap between the steps. 
See MPEP § 2172.01. In claim 34, as amended, both limitations (receiving potential predictor 
and dependent variables representing historical data and model generation-combination) 
are dangling. Both limitations are disconnected. 

The claims have been amended. 

Claims 1-11, 13-19, 22, 23, 25-30, and 34-40, are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Cabena et al., (Cabena hereinafter), Intelligent Miner for Data 
Applications Guide (see IDS dated 12/18/06), taken in view of Harrison, (Harrison 
hereinafter), An Intelligent Business Forecasting System (see IDS dated 12/18/06), and 
further in view of applicant's admission of prior art (AAPA hereinafter) as specified in IDS 
dated 05/27/05. 

As to claim 1, Cabena discloses a machine-based method comprising: in connection 
with a project in which a user generates a predictive model based on historical data about a 
system being modeled (see chapter 1.5.1, Pages 9-11): selecting variables having at least a 
first predetermined level of significance from a pool of potential predictor variables 
associated with the data, to form a first population of predictor variables (see page 101, 2nd 
and 3rd paragraphs), extending the first population of predictor variables (see page 93, 2nd 
paragraph) and extending the first population of predictor variables (see "supplementary 
variables" in "All other discrete and categorical variables and some interesting continuous 
variables were input as supplementary variables to be profiled with the clusters but not used 
to define them. These supplementary variables can be used to interpret the cluster as well. The 
ability to add supplementary variables at the outset of clustering is a very useful feature of 
Intelligent Miner, which allows the direct interpretation of clusters using other data very 
quickly and easily" in page 48, 1st paragraph), generating a possible model of the third 
population of predictor variables using a subsample of the historical data by the model 
generation method (see "Feature Selection" and "Train and Test" in page 95), determining 
whether the possible model generalizes to the historical data other than the subsample (see 
page 101, last paragraph), applying the possible model to all of the historical data to 
generate a final model, cross-validating the final model using random portions of the 
historical data (see page 97, last paragraph), and interacting with the system being modeled 
based on the final model (see "To ensure that the model has not overfit the data and to assess 
the model performance against a data set that has the same characteristics as the application 
universe, the model should be executed against the test data in test mode" in page 102, 1st 
paragraph, lines 1-5 and "After having iteratively improved the models, you chose the best 
model" in page 102, 3rd paragraph, line 1). 

While Cabena discloses generating a predictive model based on historical data 
about a system being modeled, Cabena fails to disclose automatically selecting a model 
generation method from a set of available model generation methods to match 
characteristics of the historical data. 

Harrison discloses automatically selecting a model generation method from a set of 
available model generation methods to match characteristics of the historical data (see page 
233, col. 2, next to last paragraph, last 7 lines). 

AAPA discloses including cross products of at least two variables, each being from 
the first population of predictor variables, selecting variables having at least a second 
predetermined level of significance from the extended first population of predictor variables 
to form a second population of predictor variables, extending the second population of 
predictor variables to include cross products of at least two variables, at least one of the 
variables being from the first population of predictor variables and having less than the first 
predetermined level of significance, selecting variables having at least a third predetermined 
level of significance from the extended second population of predictor variables to form a 
third population of predictor variables. 

Cabena and Harrison are analogous art because they are both related to predictive 
modelling. 

Therefore, it would have been obvious to one of ordinary skill in this art at the time 
of invention by applicant to utilize the automatic model selection of Harrison in the method 
of Cabena because Harrison explores the possibility of the integration of expert systems 
technology with a forecasting decision support system (see page 229, col. 1, lines 1-4), and as 
a result, Harrison reports that testing of his prototype shows that the system is useful for 
managers who have no forecasting technique and computing background and want to 
improve their decision making by means of quantitative forecasting (see page 235, col. 2, 
next to last paragraph). 

The rejection of claim 1 based on Cabena, Harrison, and AAPA is clearly wrong, because 
not one of the references, alone or in combination, described or would have made obvious any of 
the following five claimed actions used in generating a predictive model: 

1. Selecting variables having at least a first predetermined level of significance from a 
pool of predictor variables to form a first population of predictor variables. 

2. Extending the first population of predictor variables to include cross products of the 
variables in the first population. 

3. Selecting variables in the extended first population that have at least a second 
predetermined level of significance to form a second population of predictor variables. 

4. Selecting at least one variable having less than the first predetermined level of 
significance from the pool of predictor variables (recited in action 1) to produce cross products. 
The cross products are included in the second population to form an extended second population 
of variables. 

5. Selecting variables in the extended second population and having at least a third 
predetermined level of significance to form a third population of predictor variables. 
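
For illustration only, the five actions listed above can be sketched in Python as follows. The sketch is not taken from the application or the claims. The significance measure (absolute Pearson correlation with the dependent variable), the function names, the thresholds t1, t2, and t3, and the choice in action 4 to pair each below-threshold pool variable with the variables of the first population are all assumptions made for the example.

```python
import numpy as np

def significance(x, y):
    """Hypothetical significance measure: |Pearson correlation| with y."""
    if np.std(x) == 0:
        return 0.0
    return abs(np.corrcoef(x, y)[0, 1])

def select(columns, y, threshold):
    """Keep the named columns whose significance meets the threshold."""
    return {name: x for name, x in columns.items()
            if significance(x, y) >= threshold}

def cross_products(left, right):
    """Products of each column in `left` with each different column in `right`."""
    out = {}
    for a, xa in left.items():
        for b, xb in right.items():
            # Skip squares and the mirrored duplicate of a pair already built.
            if a != b and f"{b}*{a}" not in out:
                out[f"{a}*{b}"] = xa * xb
    return out

def staged_selection(pool, y, t1, t2, t3):
    # Action 1: first population = pool variables meeting the first level.
    first = select(pool, y, t1)
    # Action 2: extend the first population with its own cross products.
    extended_first = {**first, **cross_products(first, first)}
    # Action 3: second population = extended variables meeting the second level.
    second = select(extended_first, y, t2)
    # Action 4: cross products using pool variables that fell below the first
    # level (assumption: each is paired with first-population variables).
    weak = {name: x for name, x in pool.items() if name not in first}
    extended_second = {**second, **cross_products(weak, first)}
    # Action 5: third population = variables meeting the third level.
    third = select(extended_second, y, t3)
    return first, second, third
```

In this sketch a pool variable that is insignificant on its own can still reach the third population through a cross product, which is the effect of action 4 that the discussion below turns on.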

Cabena described none of the five actions. Contrary to the examiner's assertion that 
Cabena described the first action in the second and third paragraphs of page 101, those parts of 
Cabena described visualization of the modeling results, not selecting variables for generating a 
predictive model. In addition, although Cabena did describe determining "whether there are 
strong variables, . . . split the data into multiple files . . .", Cabena did not "select variables having 
at least a first predetermined level of significance", but instead ran "RBF against each of the 
separate files. . ." (third paragraph, page 101). In other words, although Cabena may have 
identified whether there are strong variables, he used all variables for display and no selection 
was done. 

Also contrary to the examiner's assertion that Cabena partially described the second 
action by expanding his population of variables using supplementary variables, Cabena's 
supplementary variables are only used for profiling, not for defining the clusters (first paragraph, 
page 48). Cabena referred to: ". . . supplementary variables to aid in the interpretation of the 
neural cluster" (last paragraph, page 52), ". . . supplementary variables, which are profiled by 
model region but not used to build the model" (second paragraph, page 100), and "input all data 
as supplementary not used in the prediction" (last paragraph, page 118). Cabena did not expand 
the population of predictor variables for use in generating a predictive model. 

Applicant: Stephen K. Pinto et al. 
Serial No.: 10/826,630 
Filed: April 16, 2004 
Attorney's Docket No.: 17146-0007001 

The examiner conceded that Cabena did not describe the other three actions of claim 1 
listed above. But the examiner alleged that AAPA described "cross products" and — in 
combination with Cabena — described the second action referred to above. And the examiner 
contended that AAPA described actions 3 through 5. 

AAPA did not describe and would not have made obvious actions 3 through 5. With 
regard to action 3, although AAPA constructed new attributes from cross products of existing 
attributes and extended the population of the attributes (page 50, "Derived Attributes"), AAPA 
did not select "variables having at least a second predetermined level of significance from the 
extended first population of predictor variables" after including the cross products. 

With regard to action 4, AAPA did not describe subsequent actions of including more 
cross products of variables into "the second population of predictor variables" after previous 
actions of including cross products. AAPA's general description of data construction using cross 
products would not have made obvious the multi-action variable selection and predictor variable 
population extension of claim 1. In addition, AAPA did not describe and would not have made 
obvious that at least one of the variables producing the cross product has "less than the first 
predetermined level of significance", also recited in action 4. AAPA listed data to be used or 
excluded based on selection criteria (page 48, "Output"), but said nothing about whether those 
data excluded would be used for constructing cross products (page 50, "Output"). 

AAPA also did not select "variables having at least a third predetermined level of 
significance from the extended second population of predictor variables" after including the 
cross products. AAPA said nothing about selecting variables after including cross products. 

Harrison had nothing to do with the actions of selecting variables and extending a 
population of predictor variables for generating a predictive model, recited by claim 1. 

Accordingly, the combination of Cabena, Harrison, and AAPA would not have made 
obvious the features of claim 1. 

The rejection of claim 34 based on Cabena, Harrison, and AAPA is also clearly wrong, at 
least because none of the references, alone or in combination, described or would have made 
obvious combining two models "based on response propensities of each model" in order to 
create cross-modal deciles, let alone based on "weaving of the historical data" to provide 
cross-modal optimization or concatenating the predictions of the two models. 

The examiner conceded that Cabena had nothing to do with combining models. The 
examiner also conceded that although Harrison described combining models, Harrison did not 
describe and would not have made obvious combining two models "based on response 
propensities of each model" in order to create cross-modal deciles, let alone based on "weaving 
of the historical data" to provide cross-modal optimization or concatenating the predictions of 
the two models. 

The examiner alleged that AAPA described combining based on response propensities in 
order to create cross-modal deciles and based on data weaving to provide cross-modal 
optimization. However, the examiner failed to point out which part of AAPA described such 
features. In fact, AAPA had nothing to do with combining models. Nothing was combined 
based on "response propensities," and no data was woven. AAPA said nothing about "to create 
cross-modal deciles" or "to provide cross-modal optimization". AAPA mentioned "propensity" 
(page 41 "Data mining success criteria"), but only as a selected criterion for determining success 
of a project. 

Accordingly, none of Cabena, Harrison, and AAPA, alone or in combination, described 
or would have made obvious the features of claim 34. 

The dependent claims are patentable over the cited references, for at least the same 
reasons discussed with respect to independent claims 1 and 34, from which they depend. 

Claims 31-33 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Cabena taken in view of Harrison, and further in view of Galperin et al., (Galperin 
hereinafter), U.S. Patent 6845215 (see IDS dated 2/28/05). 

As to claim 31, Cabena discloses a machine-based method comprising in connection 
with a project, generating a predictive model based on the historical data (see chapter 1.5.1, 
Pages 9-11), and displaying to a user a lift chart (see page 101, last paragraph, lines 1-5 and 
page 105, 1st and 2nd paragraphs), monotonicity (see page 101, last paragraph, last 3 lines and 
page 119, 2nd bullet from the bottom), and concordance scores (see Chapter 1.5.1, Pages 9-11) 
associated with each step in a step-wise model fitting process (see page 98, 2nd paragraph). 
While Cabena discloses generating a predictive model based on historical data about a 
system being modeled, Cabena fails to disclose automatically selecting a model generation 
method from a set of available model generation methods to match characteristics of the 
historical data. 
Harrison discloses automatically selecting a model generation method from a set of available 
model generation methods to match characteristics of the historical data about a system 
being modeled (see page 233, col. 2, next to last paragraph, last 7 lines). 

Neither Cabena nor Harrison expressly discloses displaying to a user concordance scores. 

Such feature is however well-known in the art. Examiner notes that the claims 
reciting "concordance scores" were interpreted as "area under curve". 

In fact, Galperin teaches calculating concordance scores being obtained based on a 
receiver-operator-characteristic curve and indicating to the user goodness of fit of the 
historical data to the generated predictive model. (See "measures the integral criterion of lift 
within a range [x1, x2] (say, between 20% and 50%) calculated by the formula ... " in col. 3, 
lines 18-28 and col. 4, lines 8-27). 

Cabena, Harrison, and Galperin are analogous art because they are related to 
predictive modelling. 

Therefore, it would have been obvious to one of ordinary skill in this art at the time 
of invention by applicant to utilize the automatic model selection of Galperin in the Cabena- 
Harrison method because Galperin solves for lift, accomplishing the following advantages 
over existing commercial techniques: tuning to a predefined interval of a sorted customer list 
and using a variety of different modeling approaches (see col. 2, lines 13-40), and as a result, 
Galperin reports that by using his invention marketing analysts will be better able to: 
predict the propensity of individual prospects to respond to an offer; identify customers and 
prospects who are most likely to default on loans or prepay loans; identify customers who 
are most amenable to cross-sell and up-sell opportunities; predict claims experience, so that 
insurers can better establish risk and set premiums appropriately; and identify instances of 
credit-card fraud (see col. 2, lines 40-54) as well as that using his invention in conjunction 
with a neural network provides models for analyzing data to indicate the individuals or 
classes of individuals who are most likely to respond to targeted marketing (see col. 5, lines 
40-44). 

The rejection of claim 31 based on Cabena, Harrison, and Galperin is also clearly wrong, 
at least because none of the references, alone or in combination, described displaying to a user 
"concordance scores". 

The examiner interpreted "concordance scores" to be "area under curve". However, 
claim 31 explains that concordance scores are obtained based on "a 
receiver-operator-characteristic curve". 

The examiner conceded that Cabena and Harrison did not describe displaying to a user 
"concordance scores" but alleged that Galperin did. Galperin calculated integral criterion of lift 
to maximize the lift within a range (abstract, column 3, lines 18-20, and column 4, line 7), which 
had nothing to do with "concordance scores". As claim 31 explains, concordance scores indicate 
"to the user goodness of fit of the historical data to the generated predictive model", while lift, as 
a term of art in data mining, is a measure of the performance of a model in segmenting the 
population. The lift of a subset of the population is defined as the predicted response rate for that 
subset divided by the predicted response rate for the population. For example, suppose a 
population has a predicted response rate of 5%, but a certain model has identified a segment with 
a predicted response rate of 20%. Then that segment would have a lift of 4.0 (20%/5%) 
(http://en.wikipedia.org/wiki/Lift_(data_mining)). Lift or integral criterion of lift has nothing to 
do with concordance scores that indicate "to the user goodness of fit of the historical data to the 
generated predictive model". 
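
By way of a minimal sketch only, the lift arithmetic in the preceding paragraph can be written out in Python. The function names and the list-based representation of predicted response rates are assumptions made for the example and do not come from Galperin or from the application.

```python
def lift(segment_rate, population_rate):
    """Lift = predicted response rate of a segment / that of the population."""
    if population_rate <= 0:
        raise ValueError("population response rate must be positive")
    return segment_rate / population_rate

def segment_lift(predicted_rates, in_segment):
    """Lift of a segment, given per-individual predicted response rates and
    a parallel list of booleans marking segment membership (hypothetical layout)."""
    segment = [r for r, keep in zip(predicted_rates, in_segment) if keep]
    if not segment:
        raise ValueError("segment is empty")
    segment_rate = sum(segment) / len(segment)
    population_rate = sum(predicted_rates) / len(predicted_rates)
    return lift(segment_rate, population_rate)

# The example from the text: a 20% segment within a 5% population.
print(lift(0.20, 0.05))  # 4.0
```

The calculation uses only predicted response rates for a segment and for the population. It does not measure how well the historical data fit the model.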

Accordingly, Galperin did not calculate concordance scores and did not display to a user 
"concordance scores". The combination of Cabena, Harrison, and Galperin did not describe and 
would not have made obvious the features of claim 31. 

The dependent claims are patentable over the cited references, for at least the same 
reasons discussed with respect to independent claim 31, from which they depend. 

All of the dependent claims are patentable for at least the reasons for which the claims on 
which they depend are patentable. 

Canceled claims, if any, have been canceled without prejudice or disclaimer. 
Any circumstance in which the applicant has (a) addressed certain comments of the examiner 
does not mean that the applicant concedes other comments of the examiner, (b) made arguments 
for the patentability of some claims does not mean that there are not other good reasons for 
patentability of those claims and other claims, or (c) amended or canceled a claim does not mean 
that the applicant concedes any of the examiner's positions with respect to that claim or other 
claims. 